    Top Comment or Flop Comment? Predicting and Explaining User Engagement in Online News Discussions

    Comment sections below online news articles enjoy growing popularity among readers. However, the overwhelming number of comments makes it infeasible for the average news consumer to read all of them and hinders engaging discussions. Most platforms display comments in chronological order, which neglects that some of them are more relevant to users and are better conversation starters. In this paper, we systematically analyze user engagement in the form of the upvotes and replies that a comment receives. Based on comment texts, we train a model to distinguish comments that have either a high or a low chance of receiving many upvotes and replies. Our evaluation on user comments from TheGuardian.com compares recurrent and convolutional neural network models, and a traditional feature-based classifier. Further, we investigate what makes some comments more engaging than others. To this end, we identify engagement triggers and arrange them in a taxonomy. Explanation methods for neural networks reveal which input words have the strongest influence on our model's predictions. In addition, we evaluate on a dataset of product reviews, which exhibit properties similar to user comments, such as featuring upvotes for helpfulness.

    Comment: Accepted at the International Conference on Web and Social Media (ICWSM 2020); 11 pages; code and data are available at https://hpi.de/naumann/projects/repeatability/text-mining.htm
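    As an illustration of the paper's setup, the sketch below trains a minimal text classifier to separate likely-engaging from likely-unengaging comments and uses it to rank unseen comments, as an alternative to chronological ordering. TF-IDF plus logistic regression is a stand-in for the feature-based baseline mentioned in the abstract; the example comments and labels are invented, not taken from the Guardian dataset.

    ```python
    # Minimal sketch: predict whether a comment will attract engagement from
    # its text alone. TF-IDF + logistic regression stands in for the paper's
    # feature-based classifier; the comments and labels are invented toy data.
    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.linear_model import LogisticRegression
    from sklearn.pipeline import make_pipeline

    comments = [
        "What does this policy mean for renters in practice?",  # question, invites replies
        "First!",                                               # low-effort
        "Good analysis, but the cited data seems outdated.",    # counterpoint
        "lol",                                                  # low-effort
    ]
    labels = [1, 0, 1, 0]  # 1 = high expected engagement (many upvotes/replies)

    model = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)), LogisticRegression())
    model.fit(comments, labels)

    # Rank unseen comments by predicted probability of high engagement.
    new = ["Has anyone checked the original study?", "meh"]
    for text, p in zip(new, model.predict_proba(new)[:, 1]):
        print(f"{p:.2f}  {text}")
    ```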

    ssHMM: extracting intuitive sequence-structure motifs from high-throughput RNA-binding protein data

    RNA-binding proteins (RBPs) play an important role in RNA post-transcriptional regulation and recognize target RNAs via sequence-structure motifs. The extent to which RNA structure influences protein binding in the presence or absence of a sequence motif is still poorly understood. Existing RNA motif finders either take the structure of the RNA only partially into account or employ models that are not directly interpretable as sequence-structure motifs. We developed ssHMM, an RNA motif finder based on a hidden Markov model (HMM) and Gibbs sampling that fully captures the relationship between the RNA sequence and the secondary-structure preference of a given RBP. Compared to previous methods, which output separate logos for sequence and structure, it directly produces a combined sequence-structure motif when trained on a large set of sequences. ssHMM’s model is visualized intuitively as a graph and facilitates biological interpretation. ssHMM can be used to find novel bona fide sequence-structure motifs of uncharacterized RBPs, such as the one presented here for the YY1 protein. ssHMM reaches a high motif recovery rate on synthetic data, recovers known RBP motifs from CLIP-Seq data, and scales linearly with input size, making it considerably faster than MEMERIS and RNAcontext on large datasets while being on par with GraphProt. It is freely available on GitHub and as a Docker image.
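    The sketch below illustrates the core idea of a joint sequence-structure motif: each motif position holds a nucleotide distribution per structure context (ssHMM distinguishes contexts such as stem, hairpin loop, internal loop, multiloop, and exterior), so sequence and structure preference are scored together rather than in separate logos. The probability tables and scoring function are hand-set for illustration only; ssHMM's actual model is an HMM fitted with Gibbs sampling, not these fixed tables.

    ```python
    # Illustrative joint sequence-structure motif: emission probabilities per
    # motif position *and* structure context. Hand-set numbers, invented for
    # illustration; not ssHMM's fitted parameters.
    import math

    # motif[i][context][nucleotide] = P(nucleotide | position i, context)
    motif = [
        {"hairpin": {"A": 0.1, "C": 0.1, "G": 0.1, "U": 0.7},
         "stem":    {"A": 0.25, "C": 0.25, "G": 0.25, "U": 0.25}},
        {"hairpin": {"A": 0.7, "C": 0.1, "G": 0.1, "U": 0.1},
         "stem":    {"A": 0.25, "C": 0.25, "G": 0.25, "U": 0.25}},
    ]

    def log_score(seq, contexts):
        """Log-probability of a (sequence, structure-context) pair under the motif."""
        return sum(math.log(motif[i].get(c, {}).get(nt, 1e-9))
                   for i, (nt, c) in enumerate(zip(seq, contexts)))

    # The same "UA" dinucleotide scores higher in a hairpin loop than in a
    # stem, i.e., the motif expresses a structure preference, not just a
    # sequence preference.
    print(log_score("UA", ["hairpin", "hairpin"]))  # ~ -0.71
    print(log_score("UA", ["stem", "stem"]))        # ~ -2.77
    ```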

    Do We Need Another Explainable AI Method? Toward Unifying Post-hoc XAI Evaluation Methods into an Interactive and Multi-dimensional Benchmark

    In recent years, Explainable AI (xAI) has attracted a lot of attention as various countries have turned explanations into a legal right. xAI allows for improving models beyond the accuracy metric, e.g., by debugging the learned patterns and demystifying the AI's behavior. The widespread use of xAI has brought new challenges. On the one hand, the number of published xAI algorithms has boomed, making it difficult for practitioners to select the right tool. On the other hand, experiments have highlighted how easily data scientists can misuse xAI algorithms and misinterpret their results. To tackle the issue of comparing and correctly using feature-importance xAI algorithms, we propose Compare-xAI, a benchmark that unifies all exclusive functional testing methods applied to xAI algorithms. We propose a selection protocol to shortlist non-redundant functional tests from the literature, each targeting a specific end-user requirement in explaining a model. The benchmark encapsulates the complexity of evaluating xAI methods into a hierarchical, three-level scoring targeting three end-user groups: researchers, practitioners, and laymen in xAI. The most detailed level provides one score per test. The second level regroups tests into five categories (fidelity, fragility, stability, simplicity, and stress tests). The last level is the aggregated comprehensibility score, which encapsulates the ease of correctly interpreting the algorithm's output in one easy-to-compare value. Compare-xAI's interactive user interface helps mitigate errors in interpreting xAI results by quickly listing the recommended xAI solutions for each ML task and their current limitations. The benchmark is made available at https://karim-53.github.io/cxai
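    The three-level scoring can be pictured with a small aggregation sketch: per-test scores are rolled up into the five named categories, then into one overall value. The test names and scores below are invented, and the plain-mean aggregation is an assumption on our part; only the five category names come from the abstract.

    ```python
    # Sketch of the three-level scoring: per-test scores -> five category
    # scores -> one comprehensibility score. Test names and scores are
    # invented; the plain-mean aggregation is an assumption.
    test_scores = {
        # (category, test): score in [0, 1]
        ("fidelity",   "ground_truth_ranking"):   0.7,
        ("fidelity",   "exact_values_on_trees"):  0.9,
        ("fragility",  "adversarial_baseline"):   0.4,
        ("stability",  "repeated_runs"):          1.0,
        ("simplicity", "sparse_explanation"):     0.6,
        ("stress",     "high_dimensional_input"): 0.5,
    }

    # Level 2: one score per category (fidelity, fragility, stability,
    # simplicity, stress tests).
    by_category = {}
    for (category, _), score in test_scores.items():
        by_category.setdefault(category, []).append(score)
    category_scores = {c: sum(s) / len(s) for c, s in by_category.items()}

    # Level 3: aggregated comprehensibility score.
    comprehensibility = sum(category_scores.values()) / len(category_scores)

    print(category_scores)
    print(f"comprehensibility: {comprehensibility:.2f}")
    ```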

    Validation of Tagging Suggestion Models for a Hotel Ticketing Corpus

    This paper investigates methods for predicting tags on a textual corpus that describes hotel staff inputs in a ticketing system. The aim is to improve the tagging process and find the most suitable method for suggesting tags for a new text entry. The paper consists of two parts: (i) exploration of the existing sample data, including statistical analysis and visualisation to provide an overview, and (ii) evaluation of tag prediction approaches. We include approaches from different research fields to cover a broad spectrum of possible solutions. As a result, we test a machine learning model for multi-label classification (using gradient boosting), a statistical approach (using frequency heuristics), and two simple similarity-based classification approaches (Nearest Centroid and k-Nearest Neighbours). The experiment comparing the approaches uses recall to measure the quality of the results. Finally, we recommend the modelling approach that yields the best tag-prediction accuracy on the sample data.
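    To make the comparison concrete, here is a minimal sketch that pits the frequency heuristic against a k-Nearest-Neighbours suggester on an invented toy ticket corpus, scoring both by recall of the true tags. Only the comparison pattern follows the paper; the tickets, tags, and parameters are made up.

    ```python
    # Toy comparison of two tag suggesters, scored by recall of the true tags.
    from collections import Counter
    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.neighbors import NearestNeighbors

    tickets = ["broken shower in room 12", "wifi not working in lobby",
               "printer out of toner", "guest asks for late checkout"]
    tags = [{"maintenance", "bathroom"}, {"it"}, {"it"}, {"front-desk"}]

    query, true_tags = "no hot water in the shower", {"maintenance", "bathroom"}

    # Frequency heuristic: always suggest the k most common tags overall.
    k = 2
    freq_tags = {t for t, _ in Counter(t for ts in tags for t in ts).most_common(k)}

    # k-Nearest Neighbours (here with one neighbour): suggest the tags of
    # the most similar existing ticket.
    vec = TfidfVectorizer().fit(tickets)
    nn = NearestNeighbors(n_neighbors=1).fit(vec.transform(tickets))
    _, idx = nn.kneighbors(vec.transform([query]))
    knn_tags = tags[int(idx[0][0])]

    recall = lambda suggested: len(suggested & true_tags) / len(true_tags)
    print("frequency heuristic:", recall(freq_tags))   # 0.5 on this toy data
    print("k-nearest neighbours:", recall(knn_tags))   # 1.0 on this toy data
    ```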

    Diversifying Product Review Rankings: Getting the Full Picture

    E-commerce Web sites owe much of their popularity to consumer reviews provided together with product descriptions. On-line customers spend hours going through heaps of textual reviews to build confidence in products they are planning to buy. At the same time, popular products have thousands of user-generated reviews. Current approaches to presenting them to the user, or recommending an individual review for a product, are based on the helpfulness or usefulness of each review. In this paper we look at the top-k reviews in a ranking to give the user a good summary, with each review complementing the others. To this end we use Latent Dirichlet Allocation to detect latent topics within reviews and use the star rating assigned to the product as an indicator of the polarity expressed towards the product and the latent topics within the review. We present a framework to cover different ranking strategies based on the user’s need: summarizing all reviews; focusing on a particular latent topic; or focusing on positive, negative, or neutral aspects. We evaluated the system using manually annotated review data from a commercial review Web site.

    Winner of the best paper award at the 2011 IEEE/WIC/ACM International Conferences on Web Intelligence and Intelligent Agent Technology.
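    A minimal sketch of the core ranking idea: infer per-review topic distributions with LDA, then greedily pick the top-k reviews so that each pick covers topic mass the earlier picks left out. The reviews are invented, and the greedy coverage rule is a simplification of the paper's ranking strategies, which also factor in star-rating polarity.

    ```python
    # Sketch: LDA topic vectors per review, then greedy top-k selection that
    # rewards covering topics earlier picks missed. Reviews are invented.
    import numpy as np
    from sklearn.decomposition import LatentDirichletAllocation
    from sklearn.feature_extraction.text import CountVectorizer

    reviews = [
        "battery life is great and lasts all day",
        "battery drains fast very disappointing",
        "camera quality is stunning in daylight",
        "the screen scratches far too easily",
    ]

    X = CountVectorizer().fit_transform(reviews)
    lda = LatentDirichletAllocation(n_components=3, random_state=0)
    vecs = lda.fit_transform(X)  # one topic distribution per review

    # Greedy diversification: each pick maximizes newly covered topic mass.
    covered = np.zeros(vecs.shape[1])
    ranking = []
    for _ in range(2):  # top-k with k = 2
        gains = [(np.maximum(covered, v) - covered).sum() for v in vecs]
        best = int(np.argmax(gains))
        ranking.append(best)
        covered = np.maximum(covered, vecs[best])
        vecs[best] = 0.0  # never pick the same review twice

    print([reviews[i] for i in ranking])
    ```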

    Explainable AI under contract and tort law

    This paper shows that the law, in subtle ways, may set hitherto unrecognized incentives for the adoption of explainable machine learning applications. In doing so, we make two novel contributions. First, on the legal side, we show that to avoid liability, professional actors, such as doctors and managers, may soon be legally compelled to use explainable ML models. We argue that the importance of explainability reaches far beyond data protection law and crucially influences questions of contractual and tort liability for the use of ML models. To this effect, we conduct two legal case studies, in medical and corporate-merger applications of ML. As a second contribution, we discuss the (legally required) trade-off between accuracy and explainability and demonstrate the effect in a technical case study in the context of spam classification.
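    The kind of accuracy-versus-explainability measurement the technical case study describes can be sketched as follows: compare an interpretable spam classifier (logistic regression, whose per-word coefficients can be read directly) against a harder-to-explain ensemble on the same data. The toy emails, labels, and model choices are stand-ins, not the paper's actual experiment.

    ```python
    # Toy measurement of the accuracy/explainability trade-off on spam data:
    # an interpretable linear model vs. a gradient-boosted ensemble.
    from sklearn.ensemble import GradientBoostingClassifier
    from sklearn.feature_extraction.text import CountVectorizer
    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import cross_val_score

    mails = ["win money now", "meeting moved to noon", "free prize claim now",
             "lunch tomorrow?", "claim your free money", "project update attached"]
    y = [1, 0, 1, 0, 1, 0]  # 1 = spam

    X = CountVectorizer().fit_transform(mails)  # fit on all mails: fine for a sketch
    for name, clf in [("interpretable (logistic regression)", LogisticRegression()),
                      ("black box (gradient boosting)", GradientBoostingClassifier())]:
        acc = cross_val_score(clf, X, y, cv=3).mean()
        print(f"{name}: mean accuracy {acc:.2f}")
    ```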